Empirical Evaluation of Semi-automated XML Annotation of Text Documents with the GoldenGATE Editor
Internal identifier: 000E59 (Main/Exploration); previous: 000E58; next: 000E60
Authors: Guido Sautter [Germany]; Klemens Böhm [Germany]; Frank Padberg [Germany]; Walter Tichy [Germany]
Source:
- Lecture Notes in Computer Science [0302-9743]; 2007.
Abstract
Abstract: Digitized scientific documents should be marked up according to domain-specific XML schemas, to make maximum use of their content. Such markup allows for advanced, semantics-based access to the document collection. Many NLP applications have been developed to support automated annotation. But NLP results often are not accurate enough; and manual corrections are indispensable. We therefore have developed the GoldenGATE editor, a tool that integrates NLP applications and assistance features for manual XML editing. Plain XML editors do not feature such a tight integration: Users have to create the markup manually or move the documents back and forth between the editor and (mostly command line) NLP tools. This paper features the first empirical evaluation of how users benefit from such a tight integration when creating semantically rich digital libraries. We have conducted experiments with humans who had to perform markup tasks on a document collection from a generic domain. The results show clearly that markup editing assistance in tight combination with NLP functionality significantly reduces the user effort in annotating documents.
DOI: 10.1007/978-3-540-74851-9_30
Links toward previous steps (curation, corpus...)
- to stream Istex, to step Corpus: 000478
- to stream Istex, to step Curation: 000471
- to stream Istex, to step Checkpoint: 000874
- to stream Main, to step Merge: 000E72
- to stream Main, to step Curation: 000E59
The document in XML format
<record><TEI wicri:istexFullTextTei="biblStruct"><teiHeader><fileDesc><titleStmt><title xml:lang="en">Empirical Evaluation of Semi-automated XML Annotation of Text Documents with the GoldenGATE Editor</title>
<author><name sortKey="Sautter, Guido" sort="Sautter, Guido" uniqKey="Sautter G" first="Guido" last="Sautter">Guido Sautter</name>
</author>
<author><name sortKey="Bohm, Klemens" sort="Bohm, Klemens" uniqKey="Bohm K" first="Klemens" last="Böhm">Klemens Böhm</name>
</author>
<author><name sortKey="Padberg, Frank" sort="Padberg, Frank" uniqKey="Padberg F" first="Frank" last="Padberg">Frank Padberg</name>
</author>
<author><name sortKey="Tichy, Walter" sort="Tichy, Walter" uniqKey="Tichy W" first="Walter" last="Tichy">Walter Tichy</name>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:79233416758986A23C3D805E917E2AB681EC3199</idno>
<date when="2007" year="2007">2007</date>
<idno type="doi">10.1007/978-3-540-74851-9_30</idno>
<idno type="url">https://api.istex.fr/document/79233416758986A23C3D805E917E2AB681EC3199/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">000478</idno>
<idno type="wicri:Area/Istex/Curation">000471</idno>
<idno type="wicri:Area/Istex/Checkpoint">000874</idno>
<idno type="wicri:doubleKey">0302-9743:2007:Sautter G:empirical:evaluation:of</idno>
<idno type="wicri:Area/Main/Merge">000E72</idno>
<idno type="wicri:Area/Main/Curation">000E59</idno>
<idno type="wicri:Area/Main/Exploration">000E59</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title level="a" type="main" xml:lang="en">Empirical Evaluation of Semi-automated XML Annotation of Text Documents with the GoldenGATE Editor</title>
<author><name sortKey="Sautter, Guido" sort="Sautter, Guido" uniqKey="Sautter G" first="Guido" last="Sautter">Guido Sautter</name>
<affiliation wicri:level="3"><country xml:lang="fr">Allemagne</country>
<wicri:regionArea>Department of Computer Science, Universität Karlsruhe (TH), Am Fasanengarten 5, 76128 Karlsruhe</wicri:regionArea>
<placeName><region type="land" nuts="1">Bade-Wurtemberg</region>
<region type="district" nuts="2">District de Karlsruhe</region>
<settlement type="city">Karlsruhe</settlement>
</placeName>
</affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">Allemagne</country>
</affiliation>
</author>
<author><name sortKey="Bohm, Klemens" sort="Bohm, Klemens" uniqKey="Bohm K" first="Klemens" last="Böhm">Klemens Böhm</name>
<affiliation wicri:level="3"><country xml:lang="fr">Allemagne</country>
<wicri:regionArea>Department of Computer Science, Universität Karlsruhe (TH), Am Fasanengarten 5, 76128 Karlsruhe</wicri:regionArea>
<placeName><region type="land" nuts="1">Bade-Wurtemberg</region>
<region type="district" nuts="2">District de Karlsruhe</region>
<settlement type="city">Karlsruhe</settlement>
</placeName>
</affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">Allemagne</country>
</affiliation>
</author>
<author><name sortKey="Padberg, Frank" sort="Padberg, Frank" uniqKey="Padberg F" first="Frank" last="Padberg">Frank Padberg</name>
<affiliation wicri:level="3"><country xml:lang="fr">Allemagne</country>
<wicri:regionArea>Department of Computer Science, Universität Karlsruhe (TH), Am Fasanengarten 5, 76128 Karlsruhe</wicri:regionArea>
<placeName><region type="land" nuts="1">Bade-Wurtemberg</region>
<region type="district" nuts="2">District de Karlsruhe</region>
<settlement type="city">Karlsruhe</settlement>
</placeName>
</affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">Allemagne</country>
</affiliation>
</author>
<author><name sortKey="Tichy, Walter" sort="Tichy, Walter" uniqKey="Tichy W" first="Walter" last="Tichy">Walter Tichy</name>
<affiliation wicri:level="3"><country xml:lang="fr">Allemagne</country>
<wicri:regionArea>Department of Computer Science, Universität Karlsruhe (TH), Am Fasanengarten 5, 76128 Karlsruhe</wicri:regionArea>
<placeName><region type="land" nuts="1">Bade-Wurtemberg</region>
<region type="district" nuts="2">District de Karlsruhe</region>
<settlement type="city">Karlsruhe</settlement>
</placeName>
</affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">Allemagne</country>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series><title level="s">Lecture Notes in Computer Science</title>
<imprint><date>2007</date>
</imprint>
<idno type="ISSN">0302-9743</idno>
<idno type="eISSN">1611-3349</idno>
</series>
<idno type="istex">79233416758986A23C3D805E917E2AB681EC3199</idno>
<idno type="DOI">10.1007/978-3-540-74851-9_30</idno>
<idno type="ChapterID">30</idno>
<idno type="ChapterID">Chap30</idno>
</biblStruct>
</sourceDesc>
<seriesStmt><idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass></textClass>
<langUsage><language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Abstract: Digitized scientific documents should be marked up according to domain-specific XML schemas, to make maximum use of their content. Such markup allows for advanced, semantics-based access to the document collection. Many NLP applications have been developed to support automated annotation. But NLP results often are not accurate enough; and manual corrections are indispensable. We therefore have developed the GoldenGATE editor, a tool that integrates NLP applications and assistance features for manual XML editing. Plain XML editors do not feature such a tight integration: Users have to create the markup manually or move the documents back and forth between the editor and (mostly command line) NLP tools. This paper features the first empirical evaluation of how users benefit from such a tight integration when creating semantically rich digital libraries. We have conducted experiments with humans who had to perform markup tasks on a document collection from a generic domain. The results show clearly that markup editing assistance in tight combination with NLP functionality significantly reduces the user effort in annotating documents.</div>
</front>
</TEI>
<affiliations><list><country><li>Allemagne</li>
</country>
<region><li>Bade-Wurtemberg</li>
<li>District de Karlsruhe</li>
</region>
<settlement><li>Karlsruhe</li>
</settlement>
</list>
<tree><country name="Allemagne"><region name="Bade-Wurtemberg"><name sortKey="Sautter, Guido" sort="Sautter, Guido" uniqKey="Sautter G" first="Guido" last="Sautter">Guido Sautter</name>
</region>
<name sortKey="Bohm, Klemens" sort="Bohm, Klemens" uniqKey="Bohm K" first="Klemens" last="Böhm">Klemens Böhm</name>
<name sortKey="Bohm, Klemens" sort="Bohm, Klemens" uniqKey="Bohm K" first="Klemens" last="Böhm">Klemens Böhm</name>
<name sortKey="Padberg, Frank" sort="Padberg, Frank" uniqKey="Padberg F" first="Frank" last="Padberg">Frank Padberg</name>
<name sortKey="Padberg, Frank" sort="Padberg, Frank" uniqKey="Padberg F" first="Frank" last="Padberg">Frank Padberg</name>
<name sortKey="Sautter, Guido" sort="Sautter, Guido" uniqKey="Sautter G" first="Guido" last="Sautter">Guido Sautter</name>
<name sortKey="Tichy, Walter" sort="Tichy, Walter" uniqKey="Tichy W" first="Walter" last="Tichy">Walter Tichy</name>
<name sortKey="Tichy, Walter" sort="Tichy, Walter" uniqKey="Tichy W" first="Walter" last="Tichy">Walter Tichy</name>
</country>
</tree>
</affiliations>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000E59 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000E59 | SxmlIndent | more
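Once a record has been extracted with the commands above, its fields can be read with any XML library. The following is a minimal Python sketch, not part of Dilib. It assumes the record arrives as a string shaped like the one shown on this page. Because the record uses the `wicri:` prefix without declaring it, the sketch binds the prefix to a placeholder namespace URI (`urn:example:wicri`, a hypothetical value) before parsing. The function names `parse_record` and `summarize` and the shortened `SAMPLE` record are illustrative only.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI: the record uses the "wicri:" prefix without
# declaring it, so a placeholder binding is injected before parsing.
WICRI_NS = "urn:example:wicri"

# Shortened sample that mirrors the structure of the record on this page.
SAMPLE = """<record><TEI wicri:istexFullTextTei="biblStruct"><teiHeader><fileDesc><titleStmt>
<title xml:lang="en">Empirical Evaluation of Semi-automated XML Annotation of Text Documents with the GoldenGATE Editor</title>
<author><name sortKey="Sautter, Guido">Guido Sautter</name></author>
<author><name sortKey="Bohm, Klemens">Klemens Böhm</name></author>
</titleStmt>
<publicationStmt><idno type="doi">10.1007/978-3-540-74851-9_30</idno></publicationStmt>
</fileDesc></teiHeader></TEI></record>"""


def parse_record(text):
    """Parse a record string after binding the undeclared wicri prefix."""
    patched = text.replace("<record>", f'<record xmlns:wicri="{WICRI_NS}">', 1)
    return ET.fromstring(patched)


def summarize(root):
    """Collect title, author names, and DOI from a parsed record."""
    stmt = root.find(".//titleStmt")
    return {
        "title": stmt.findtext("title"),
        "authors": [name.text for name in stmt.findall("author/name")],
        "doi": root.findtext(".//idno[@type='doi']"),
    }


info = summarize(parse_record(SAMPLE))
print(info["title"])
print("; ".join(info["authors"]))
print(info["doi"])
```

To process real output, the text produced by `HfdSelect` could be read from standard input and passed to `parse_record` in place of `SAMPLE`, provided it contains a single `<record>` element.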
To add a link to this page within the Wicri network
{{Explor lien |wiki= Ticri/CIDE |area= OcrV1 |flux= Main |étape= Exploration |type= RBID |clé= ISTEX:79233416758986A23C3D805E917E2AB681EC3199 |texte= Empirical Evaluation of Semi-automated XML Annotation of Text Documents with the GoldenGATE Editor }}
This area was generated with Dilib version V0.6.32.